Review for NeurIPS paper: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Neural Information Processing Systems

Additional Feedback: This paper introduces a method for efficient exploration in RL. The proposed method assumes an MDP whose high-dimensional states are generated by an underlying lower-dimensional process, so that these states can be compressed via an unsupervised learning algorithm (oracle). The method then (1) defines an MDP over the resulting low-dimensional state space; and (2) learns a policy by generating trajectories in that low-dimensional space, which arguably facilitates exploration. At each iteration, the algorithm gathers data both to compute a policy and to improve the embedding model produced by the unsupervised algorithm. The authors show that as long as the unsupervised algorithm and the tabular RL algorithm each have polynomial sample complexity, a near-optimal policy can be found with sample complexity polynomial in the number of latent states, which is much smaller than the number of high-dimensional states.
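The iterative scheme described in the review can be sketched in a few lines. This is a hypothetical toy illustration, not the paper's algorithm: a nearest-centroid bucketing stands in for the unsupervised oracle, and one-step Q-learning stands in for the tabular RL subroutine; all function names are invented for this sketch.

```python
import random
from collections import defaultdict

def make_encoder(observations, n_latent):
    """Toy unsupervised oracle: map a high-dimensional observation to the
    index of its nearest randomly chosen centroid (a latent state)."""
    centroids = random.sample(observations, n_latent)
    def encode(obs):
        return min(range(n_latent),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(obs, centroids[i])))
    return encode

def tabular_q_learning(transitions, n_actions, gamma=0.9, alpha=0.5):
    """Tabular RL over the latent MDP induced by the encoder: one pass of
    Q-learning updates on (latent_state, action, reward, next_latent_state)."""
    Q = defaultdict(float)
    for s, a, r, s_next in transitions:
        best_next = max(Q[(s_next, b)] for b in range(n_actions))
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q
```

In the paper's setting the encoder and the tabular learner would be refined jointly across iterations; here a single pass suffices to show how planning moves from the high-dimensional observations to the small latent state space.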


Path Planning with Adaptive Dimensionality

Gochev, Kalin (University of Pennsylvania) | Cohen, Benjamin (University of Pennsylvania) | Butzke, Jonathan (University of Pennsylvania) | Safonova, Alla (University of Pennsylvania) | Likhachev, Maxim (Carnegie Mellon University)

AAAI Conferences

Path planning quickly becomes computationally hard as the dimensionality of the state-space increases. In this paper, we present a planning algorithm intended to speed up path planning for high-dimensional state-spaces such as robotic arms. The idea behind this work is that while planning in a high-dimensional state-space is often necessary to ensure the feasibility of the resulting path, large portions of the path have a lower-dimensional structure. Based on this observation, our algorithm iteratively constructs a state-space of adaptive dimensionality: a state-space that is high-dimensional only where the higher dimensionality is absolutely necessary for finding a feasible path. This often drastically reduces the size of the state-space and, as a result, the planning time and memory requirements. Analytically, we show that our method is complete and is guaranteed to find a solution, if one exists, within a specified suboptimality bound. Experimentally, we apply the approach to 3D vehicle navigation (x, y, heading), and to a 7 DOF robotic arm on the Willow Garage's PR2 robot. The results from our experiments suggest that our method can be substantially faster than some of the state-of-the-art planning algorithms optimized for those tasks.
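The plan/verify/refine loop underlying the abstract can be sketched as follows. This is an illustrative simplification, not the authors' planner: a 4-connected BFS stands in for the low-dimensional search, a fixed set of cells stands in for the high-dimensional feasibility check, and "lifting" a cell is approximated by excluding it from the low-dimensional graph; all names are hypothetical.

```python
from collections import deque

def bfs(grid, start, goal, blocked):
    """Low-dimensional search: shortest 4-connected path on a grid,
    avoiding obstacle cells (grid value 1) and 'lifted' cells."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0
                    and nxt not in blocked and nxt not in prev):
                prev[nxt] = cell
                queue.append(nxt)
    return None  # no low-dimensional path remains

def adaptive_plan(grid, start, goal, infeasible_in_high_dim):
    """Outer loop: plan in low dimension, check the path against the
    high-dimensional constraint, and lift only the failing cells."""
    blocked = set()
    while True:
        path = bfs(grid, start, goal, blocked)
        if path is None:
            return None
        bad = [c for c in path if c in infeasible_in_high_dim]
        if not bad:
            return path  # path verified feasible
        # Stand-in for re-searching these cells at full dimensionality:
        blocked.update(bad)
```

The real algorithm replans *through* the lifted regions at full dimensionality rather than simply avoiding them, which is what preserves completeness and the suboptimality bound; the sketch only shows where the dimensionality decision is made.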